Best of Shareware

home *** CD-ROM | disk | FTP | other *** search

/ Best of Shareware / Best of PC Windows Shareware 1.0 - Wayzata Technology (7111) (1993).iso / mac / ZIPPED / DOS / PROGRAMG / GRAD.ZIP / PAPER.EPS < prev next >

Wrap

Text File | 1990-12-30 | 12KB | 430 lines

A SYSTEM FOR THE AUTOMATIC DIFFERENTIATION OF FORTRAN PROGRAMS Oscar Garcia ' Forest Research Institute Rotorua, New Zealand ABSTRACT A general-purpose automatic differentiation system is described. Given a Fortran subprogram for computing a function, it generates a subprogram that computes partial derivatives. The APL computer language was used in the implementation. INTRODUCTION In an effort to speed-up the estimation of parameters in complex forest growth models, an automatic differentiation system was developed (Garcia 1989, 1991). It takes as input a ' Fortran program unit (usually a subroutine or function subprogram) that computes the value of a function, and produces as output another Fortran program unit that computes partial derivatives with respect to specified independent variables. I describe here the methods and implementation, and give some figures on the performance of the generated code. Additional discussion and evaluation of results can be found in Garcia ' (1991). Performance appears satisfactory, and compares favorably with that of JAKEF (Hillstrom 1990). The system may be useful in other applications. ____________________________ Presented at the SIAM Workshop on Automatic Differentiation of Algorithms, Breckenridge, Colorado, January 6-8, 1991. 2 PRINCIPLES AND IMPLEMENTATION The basic idea was inspired by Wengert (1964). He suggested substituting for each operator a routine, that in addition to compute the result of the operation would also produce the values of the derivatives. A somewhat similar approach was implemented by hand in order to compute first and second partial derivatives in a parameter estimation program (Garcia ' 1983). Each statement in the function evaluation routine was followed by statements to compute the derivatives of the left- hand-side variable (usually an intermediate variable). For example, C = A * B would be followed by C1 = A1 * B + A * B1 C2 = A2 * B + A * B2 etc., where the Ai and Bi are partial derivatives with respect to the i-th independent variable, computed in previous statements (with obvious simplifications where the partials for some terms do not exist). The function derivatives sought follow the statement defining the final value of the function, usually at the end of the subroutine. Unlike Wengert's, this method produces stand-alone code, without the overhead of calls to routines in a run-time package. The system described here essentially automates this procedure. The statements that compute derivatives precede instead of follow each function evaluation statement, to handle correctly statements such as A = A * B . For each assignment statement, the right-hand-side expression is processed by a recursive descent parser, and Fortran code for computing the relevant derivatives is generated. Unnecessary parenthesis are avoided as much as possible. Thus, most of the redundant operations generated will be removed by any optimizing compiler. An earlier version produced fully parenthesized expressions that would inhibit optimization, since in Fortran parenthesis fix the order of operations. Identifiers for the derivatives are formed by appending to the variable name an optional user-defined delimiter character and the number of the independent variable. Identifiers exceeding 3 the 6-character Fortran standard maximum length might need to be changed by the user (a warning is produced). Many Fortran compilers, however, have the option of accepting long variable names (e.g. using the /4Nt switch in Microsoft Fortran). The existence of non-zero derivatives of left-hand-side temporary variables with respect to the independent variables is recorded in an internal data structure as the statements are processed. Initially, new statements are generated only for non-zero derivatives. In some instances this would not produce correct results, and multiple passes may be needed to initialize at zero some of the derivatives. For example, with one pass the following statements would generate incorrect code (X is the first independent variable and this is the first appearance of A; derivatives of VAR with respect to the k-th independent variable are identified as VAR_k): IF (I .LT. 5) GO TO 10 IF (I .LT. 5) GO TO 10 A = 0. A = 0. GO TO 20 GO TO 20 10 CONTINUE 10 CONTINUE A = X ══> A_1 = 1. 20 CONTINUE A = X B = A + 1 20 CONTINUE B_1 = A_1 B = A + 1 Here A_1 = 0.0 would be needed following the IF statement. Similar problems may arise from loops: A = 0. B = 0. DO 10 I = 1,N A = A + B B = C(I) * X 10 CONTINUE These cases are handled by making multiple passes over the original code. The need for another pass is signaled if a new non-zero derivative is encountered for a variable that had been previously used on the left-hand-side of an assignment. Keeping the accumulated record of non-zero derivatives causes that new derivative to be initialized at zero in the appropriate place in the next pass. A result of the myopic strategy of examining only assignment statements, one at a time, was a relatively simple and efficient system. However, a few restrictions must be observed in the source coding: 4 a) The independent variables must be scalars or array elements with constant subscripts. b) Assignments involving independent variables directly or indirectly are not allowed in IF or other control statements. c) Labels are not allowed in assignment statements. Use CONTINUE. d) User defined functions involving the independent variables are not supported. In addition, the header of the output routine must be edited manually to change its name and to include the derivatives as arguments. It may also be necessary to declare the new variables if default types or IMPLICIT declarations are not used. The procedures have been used only with Fortran 66, and designed with this dialect in mind. It is conceivable that some additional restrictions could be needed with Fortran 77. Second or higher derivatives can be generated simply by running the output through the differentiator again. Different delimiter characters for the identifiers must be used in each run. The resulting code, however, will perform redundant computations, since (for second derivatives) two versions of the gradient and of the off-diagonal Hessian elements are generated. It would not be difficult to eliminate this redundancy in a post-processing step, but this has not been implemented yet. The system was written in APL, using STSC's APL*PLUS interpreter for IBM PC or compatible microcomputers (STSC 1989). This is transparent to the user, however, and no knowledge of APL is required to use it. The high level and power of the APL computer language made possible to complete the implementation in a very short time. The system also runs without modification with the inexpensive Pocket APL (Turner 1985), and with other APL interpreters from STSC available for several microcomputers and mainframes. In order to make it more accessible, a version for the public domain I-APL interpreter has been prepared. Although all comments were removed to make it fit into the less than 30 K bytes of memory available under I-APL, this version cannot process very large problems, and is about 27 times slower than the APL*PLUS version. 5 PERFORMANCE The run-time performance of the generated code may be assessed by the work ratio (Griewank 1988), the ratio of the time required to compute a gradient to the time taken by a function evaluation. Note that the computation of the gradient produces also a function value, which in many applications is also needed. A forward differences approximation of the gradient for a function with n independent variables requires n + 1 function evaluations, that is, a work ratio of n + 1. The work ratio for central differences is 2n, or 2n + 1 if a function evaluation at the central point is included. The following table shows work ratios for several examples. Work ratios for another automatic differentiator, JAKEF (Hillstrom 1990), are also included. The tests were carried out on AT-compatible microcomputers with floating point coprocessors. Problem Variables Work ratio JAKEF 1 3 3.1 11.2 2 4 2.7 11.1 3 9 4.0 - 4 16 5.6 - 5 18 5.2 8.0 6 18 6.4 - Problems 1 and 2 are two short subroutines used for testing optimization algorithms. Problems 3 to 6 are large subroutines used for parameter estimation, with about 130 to 180 Fortran statements, and containing loops running over large numbers of observations. JAKEF uses a reverse differentiation approach (Griewank 1988), with storage requirements increasing with the number of statements executed. Only problem 5, where the number of observations was limited to 100, could be solved with JAKEF in the available memory (Garcia 1991). ' REFERENCES Garcia, O. 1983. A stochastic differential equation model for ' the height growth of forest stands. EBiometrics 39F, 1059- 1072. 6 Garcia, O. 1989. Growth Modelling - New developments. In ' Nagumo, H. and Konohira, Y. (Ed.) Japan and New Zealand Symposium on Forestry Management Planning. Japan Association for Forestry Statistics. Garcia, O. 1991. Experience with automatic differentiation in ' estimating forest growth model parameters. Presented at the SIAM Workshop on Automatic Differentiation of Algorithms, Breckenridge, Colorado, January 6-8, 1991. Griewank, A. 1988. On automatic differentiation. Argonne National Laboratory, Mathematics and Computer Science Division. Preprint ANL/MCS-P10-1088. Hilstrom, K. E. 1990. User guide for JAKEF. (Available in electronic form from netlib on Internet). STSC 1989. APL*PLUS system for the PC - User's manual. STSC, Inc. Turner, J. R. 1985. Pocket APL - Reference guide. STSC, Inc., Rockville, Maryland. Wengert, R.E. 1964. A simple automatic derivative evaluation program. ECommunications of the ACM 7F, 463-471. 7 APPENDIX FORWARD AND REVERSE DIFFERENTIATION (Omitted)